Background

I am working with two datasets. The first contains cost of living information for 132 major United States cities in 2018, with two values for each city. The first value is called the ‘rent index’, and it is measured relative to New York City; for New York City, all indices are 100. For example, if another city has a rent index of 120, that city's rent is on average 20% more expensive than New York City's. The other value is cost of living which is the

Cleaning

I prepared the cost_of_living data and joined it to latitude and longitude data for mapping.

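library(tidyr)      # separate()
library(dplyr)      # left_join()
library(openintro)  # abbr2state(); assumed to be the source of this helper
# Drop the first column, then split the combined City field on commas
cost = cost[-1]
cost = separate(cost, City, into = c("city", "state","country"), sep = ',')
cost = cost[-3]  # drop country
# The state part starts with a space, so split once more and discard the empty piece
cost = separate(cost, state, into = c("delete", "state"), sep = ' ')
cost = cost[-2]  # drop the "delete" column
cost$state = abbr2state(cost$state)  # abbreviation to full state name
# Keep only the columns needed for the join and rename state_name to state
convert = convert[,c(1,4,9,10)]
convert$state = convert$state_name
convert = convert[-2]
# Attach lat and lng to each city, matching on city and state
cost_map = left_join(cost,convert,by = c("city","state"))
str(cost_map)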
## 'data.frame':    132 obs. of  6 variables:
##  $ city                : chr  "New York" "San Francisco" "Anchorage" "Honolulu" ...
##  $ state               : chr  "New York" "California" "Alaska" "Hawaii" ...
##  $ Cost.of.Living.Index: num  100 97.8 95 94.2 93.8 ...
##  $ Rent.Index          : num  100 115.4 40.1 62.8 76.2 ...
##  $ lat                 : num  40.7 37.8 61.2 21.3 40.7 ...
##  $ lng                 : num  -73.9 -122.4 -149.1 -157.8 -73.9 ...
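
Because the tables are combined with a left join, any city that has no matching city and state in the coordinates table keeps its row but gets NA for lat and lng, and it would then be missing from a map. The following is a minimal sketch of how that could be checked. The toy data frames `cost_toy` and `convert_toy` and the city "Springfield" in state "Nowhere" are made up for illustration and are not part of the real data.

```r
library(dplyr)

# Toy stand-ins for the cleaned tables; values are illustrative only
cost_toy <- data.frame(city = c("New York", "Springfield"),
                       state = c("New York", "Nowhere"),
                       Rent.Index = c(100, 50),
                       stringsAsFactors = FALSE)
convert_toy <- data.frame(city = "New York", state = "New York",
                          lat = 40.7, lng = -73.9,
                          stringsAsFactors = FALSE)

joined <- left_join(cost_toy, convert_toy, by = c("city", "state"))

# Rows without a coordinate match keep NA in lat and lng
unmatched <- filter(joined, is.na(lat) | is.na(lng))
unmatched$city
```

Running the same filter on cost_map would list any cities that still need their names or states reconciled before mapping.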

Rent Index
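
Since every index is scaled so that New York City equals 100, a rent index can be read as a percent difference from New York City by subtracting 100. The sketch below illustrates this with the first three Rent.Index values shown in the str() output above. The helper name `pct_vs_nyc` is hypothetical and is not part of the original analysis.

```r
# Hypothetical helper: percent difference from the New York City baseline of 100
pct_vs_nyc <- function(index) index - 100

# First three Rent.Index values from the str() output
rent_index <- c("New York" = 100, "San Francisco" = 115.4, "Anchorage" = 40.1)

# Positive values are more expensive than New York City, negative values cheaper
round(pct_vs_nyc(rent_index), 1)
```

A rent index of 120 gives 20 under this reading, which matches the 20% example in the Background.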

Cost of Living

Comparison

Minimum Wage

References